Only mark all mon updates complete if there are no blocked updates #3907
Conversation
I've assigned @wpaulino as a reviewer!
Force-pushed 6bfe0bd to 49ad15c.
Force-pushed 7759b4d to 95c0696.
Force-pushed 95c0696 to ab0dda7.
Codecov Report

@@           Coverage Diff            @@
##             main    #3907      +/-  ##
==========================================
- Coverage   88.82%   88.80%   -0.02%
==========================================
  Files         165      165
  Lines      119075   119123      +48
  Branches   119075   119123      +48
==========================================
+ Hits       105769   105790      +21
- Misses      10986    11003      +17
- Partials     2320     2330      +10
🔔 1st Reminder: Hey @wpaulino! This PR has been waiting for your review.
🔔 2nd Reminder: Hey @wpaulino! This PR has been waiting for your review.
@@ -8227,7 +8229,9 @@ This indicates a bug inside LDK. Please report this error at https://github.com/
 {
 	if chan.is_awaiting_monitor_update() {
 		log_trace!(logger, "Channel is open and awaiting update, resuming it");
-		handle_monitor_update_completion!(self, peer_state_lock, peer_state, per_peer_state, chan);
+		if chan.blocked_monitor_updates_pending() == 0 {
+			handle_monitor_update_completion!(self, peer_state_lock, peer_state, per_peer_state, chan);
Wouldn't we still want to call `self.handle_monitor_update_completion_actions` so that we can actually unblock any monitor updates that are pending blocked?
Hmm, we don't do so in `handle_new_monitor_update`'s base case, which I was trying to match everywhere. A few tests fail if we change it there (with almost-harmless message-ordering changes, so nothing major, but we'd have to fix that), but I'm not convinced it's a bug - a channel should never block itself, and the unblocking of the next monitor update in a channel has to come in from a different channel or from an event, not from the channel itself. So that other channel, once it makes progress, should always unblock us.
Yeah, that's what I figured - was just being overly cautious in case we ever introduce such a monitor update that blocks the same channel.
Yea, I think you're right that that change makes sense, but I don't think it makes sense as a backport - we should do it everywhere, not just in the new places, in a separate PR.
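To make the invariant from the exchange above concrete, here is a minimal toy sketch in Rust - the types and names are invented for illustration, not LDK's actual `Channel`/`ChannelManager` API. Completion is gated on `blocked_monitor_updates_pending() == 0`, and the unblock signal always arrives from outside the channel:

// Toy model only: invented names, not LDK's real types or signatures.
#[derive(Default)]
struct ToyChannel {
	// IDs of `ChannelMonitorUpdate`s queued behind an external dependency
	// (e.g. a preimage that must first land in another channel's monitor).
	blocked_updates: Vec<u64>,
	// True while an in-flight (async) monitor update is unresolved.
	awaiting_monitor_update: bool,
}

impl ToyChannel {
	fn blocked_monitor_updates_pending(&self) -> usize {
		self.blocked_updates.len()
	}

	// Mirrors the gating this PR adds: only resume the channel when no
	// blocked updates remain; otherwise leave it paused.
	fn try_complete_monitor_update(&mut self) -> bool {
		if self.awaiting_monitor_update && self.blocked_monitor_updates_pending() == 0 {
			self.awaiting_monitor_update = false;
			true
		} else {
			false
		}
	}

	// Called from outside this channel (another channel's progress or an
	// event handler) - a channel never unblocks itself.
	fn unblock(&mut self, update_id: u64) {
		self.blocked_updates.retain(|&id| id != update_id);
	}
}

fn main() {
	let mut chan = ToyChannel { blocked_updates: vec![7], awaiting_monitor_update: true };
	assert!(!chan.try_complete_monitor_update()); // still blocked: do not resume
	chan.unblock(7); // external signal, e.g. the other channel made progress
	assert!(chan.try_complete_monitor_update()); // now safe to resume
}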
LGTM after squash
Force-pushed ac695db to b47d7db.
Squashed without further changes.
✅ Added second reviewer: @joostjager
I'd ask someone with more background on the channel state machine as a 2nd reviewer.
LGTM logic-wise, nice catch. Had to re-read some code to refresh on the paths but it makes sense.
In `handle_new_monitor_update!`, we correctly check that the channel doesn't have any blocked monitor updates pending before calling `handle_monitor_update_completion!` (which calls `Channel::monitor_updating_restored`, which in turn assumes that all generated `ChannelMonitorUpdate`s, including blocked ones, have completed). We, however, did not do the same check at several other places where we called `handle_monitor_update_completion!`. Specifically, when a monitor update completed during reload (processed via a `BackgroundEvent`) or when a monitor update completed async, we didn't check whether there were any blocked monitor updates before completing. Here we add the missing check, as well as an assertion in `Channel::monitor_updating_restored`.
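As a rough sketch of what the commit describes - using invented stand-in types, since LDK's real code routes this through the `handle_monitor_update_completion!` macro - the caller-side check and the new assertion fit together like this:

// Sketch only: ToyChannel stands in for LDK's Channel.
struct ToyChannel {
	blocked_updates: Vec<u64>,
}

impl ToyChannel {
	fn blocked_monitor_updates_pending(&self) -> usize {
		self.blocked_updates.len()
	}

	// Stand-in for Channel::monitor_updating_restored: it assumes every
	// generated update, including blocked ones, has completed, so the new
	// assertion catches any caller that skipped the check below.
	fn monitor_updating_restored(&mut self) {
		debug_assert_eq!(self.blocked_monitor_updates_pending(), 0);
		// ... resume normal channel operation ...
	}
}

// The check the commit adds on the reload (`BackgroundEvent`) and
// async-completion paths before declaring all monitor updates complete.
fn on_monitor_update_completed(chan: &mut ToyChannel) {
	if chan.blocked_monitor_updates_pending() == 0 {
		chan.monitor_updating_restored();
	} // else: leave the channel paused until something unblocks it
}

fn main() {
	let mut chan = ToyChannel { blocked_updates: vec![1] };
	on_monitor_update_completed(&mut chan); // no-op: an update is still blocked
	chan.blocked_updates.clear(); // external unblock
	on_monitor_update_completed(&mut chan); // now restores safely
}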
Force-pushed b47d7db to 418d59d.
Pushed some trivial cleanups:
$ git diff-tree -U2 b47d7db4f1 418d59dff3
diff --git a/lightning/src/ln/chanmon_update_fail_tests.rs b/lightning/src/ln/chanmon_update_fail_tests.rs
index 2a3f3b52ba..c8d408425f 100644
--- a/lightning/src/ln/chanmon_update_fail_tests.rs
+++ b/lightning/src/ln/chanmon_update_fail_tests.rs
@@ -3411,5 +3411,12 @@ fn test_inbound_reload_without_init_mon() {
}
-fn do_test_blocked_chan_preimage_release(completion_mode: u8) {
+#[derive(PartialEq, Eq)]
+enum BlockedUpdateComplMode {
+ Async,
+ AtReload,
+ Sync,
+}
+
+fn do_test_blocked_chan_preimage_release(completion_mode: BlockedUpdateComplMode) {
// Test that even if a channel's `ChannelMonitorUpdate` flow is blocked waiting on an event to
// be handled HTLC preimage `ChannelMonitorUpdate`s will still go out.
@@ -3459,5 +3466,5 @@ fn do_test_blocked_chan_preimage_release(completion_mode: u8) {
let as_htlc_fulfill_updates = get_htlc_update_msgs!(nodes[0], node_b_id);
- if completion_mode != 0 {
+ if completion_mode != BlockedUpdateComplMode::Sync {
// We use to incorrectly handle monitor update completion in cases where we completed a
// monitor update async or after reload. We test both based on the `completion_mode`.
@@ -3470,5 +3477,5 @@ fn do_test_blocked_chan_preimage_release(completion_mode: u8) {
assert!(get_monitor!(nodes[1], chan_id_2).get_stored_preimages().contains_key(&payment_hash_2));
assert!(nodes[1].node.get_and_clear_pending_msg_events().is_empty());
- if completion_mode == 1 {
+ if completion_mode == BlockedUpdateComplMode::AtReload {
let node_ser = nodes[1].node.encode();
let chan_mon_0 = get_monitor!(nodes[1], chan_id_1).encode();
@@ -3487,5 +3494,5 @@ fn do_test_blocked_chan_preimage_release(completion_mode: u8) {
reconnect_nodes(a_b_reconnect);
reconnect_nodes(ReconnectArgs::new(&nodes[2], &nodes[1]));
- } else if completion_mode == 2 {
+ } else if completion_mode == BlockedUpdateComplMode::Async {
let (latest_update, _) = get_latest_mon_update_id(&nodes[1], chan_id_2);
nodes[1]
@@ -3499,6 +3506,7 @@ fn do_test_blocked_chan_preimage_release(completion_mode: u8) {
// update_fulfill_htlc + CS is held, even though the preimage is already on disk for the
// channel.
- // Note that in completion_mode 1 we completed the CS dance in `reconnect_nodes` above.
- if completion_mode != 1 {
+ // Note that when completing as a side effect of a reload we completed the CS dance in
+ // `reconnect_nodes` above.
+ if completion_mode != BlockedUpdateComplMode::AtReload {
nodes[1].node.handle_commitment_signed_batch_test(
node_a_id,
@@ -3547,7 +3555,7 @@ fn do_test_blocked_chan_preimage_release(completion_mode: u8) {
#[test]
fn test_blocked_chan_preimage_release() {
- do_test_blocked_chan_preimage_release(0);
- do_test_blocked_chan_preimage_release(1);
- do_test_blocked_chan_preimage_release(2);
+ do_test_blocked_chan_preimage_release(BlockedUpdateComplMode::AtReload);
+ do_test_blocked_chan_preimage_release(BlockedUpdateComplMode::Sync);
+ do_test_blocked_chan_preimage_release(BlockedUpdateComplMode::Async);
}
diff --git a/lightning/src/ln/channelmanager.rs b/lightning/src/ln/channelmanager.rs
index b9a0c01c6a..195fbc286e 100644
--- a/lightning/src/ln/channelmanager.rs
+++ b/lightning/src/ln/channelmanager.rs
@@ -8229,13 +8229,9 @@ This indicates a bug inside LDK. Please report this error at https://github.com/
{
if chan.is_awaiting_monitor_update() {
- let should_resume = chan.blocked_monitor_updates_pending() == 0;
- let action_msg = if should_resume {
- "resuming it"
- } else {
- "leaving it blocked due to a blocked monitor update"
- };
- log_trace!(logger, "Channel is open and awaiting update, {action_msg}");
- if should_resume {
+ if chan.blocked_monitor_updates_pending() == 0 {
+ log_trace!(logger, "Channel is open and awaiting update, resuming it");
handle_monitor_update_completion!(self, peer_state_lock, peer_state, per_peer_state, chan);
+ } else {
+ log_trace!(logger, "Channel is open and awaiting update, leaving it blocked due to a blocked monitor update");
}
} else {
diff --git a/lightning/src/ln/quiescence_tests.rs b/lightning/src/ln/quiescence_tests.rs
index e6274f75f0..c2ab17e726 100644
--- a/lightning/src/ln/quiescence_tests.rs
+++ b/lightning/src/ln/quiescence_tests.rs
@@ -255,7 +255,7 @@ fn test_quiescence_waits_for_async_signer_and_monitor_update() {
// We have two updates pending:
{
- let chain_monitor = &nodes[0].chain_monitor;
+ let test_chain_mon = &nodes[0].chain_monitor;
let (_, latest_update) =
- chain_monitor.latest_monitor_update_id.lock().unwrap().get(&chan_id).unwrap().clone();
+ test_chain_mon.latest_monitor_update_id.lock().unwrap().get(&chan_id).unwrap().clone();
let chain_monitor = &nodes[0].chain_monitor.chain_monitor;
// One for the latest commitment transaction update from the last `revoke_and_ack`
@@ -263,9 +263,7 @@ fn test_quiescence_waits_for_async_signer_and_monitor_update() {
expect_payment_sent(&nodes[0], preimage, None, false, true);
- let chain_monitor = &nodes[0].chain_monitor;
let (_, new_latest_update) =
- chain_monitor.latest_monitor_update_id.lock().unwrap().get(&chan_id).unwrap().clone();
+ test_chain_mon.latest_monitor_update_id.lock().unwrap().get(&chan_id).unwrap().clone();
assert_eq!(new_latest_update, latest_update + 1);
- let chain_monitor = &nodes[0].chain_monitor.chain_monitor;
// One for the commitment secret update from the last `revoke_and_ack`
chain_monitor.channel_monitor_updated(chan_id, new_latest_update).unwrap();
The only diff since @wpaulino's ACK is the trivial changes above, so landing.